Thesis Oral Defense - Shiva Kaul
July 18, 2024 2:00pm - 4:00pm
Location: In Person - Mauldin Auditorium, Newell-Simon 1305
Speaker: SHIVA KAUL, Ph.D. Candidate, Computer Science Department, Carnegie Mellon University
https://www.cs.cmu.edu/~skkaul/

Classical Improvements to Modern Machine Learning

The following two dilemmas of modern foundation models concern statistical accuracy and computational efficiency, respectively:

(1) Can language models be trusted to rigorously answer important scientific questions? (Specifically, causal questions from evidence-based medicine, which are currently answered through meta-analysis.)
(2) Can Transformers and RNNs be replaced by faster state-space models (which are linear across time / sequence length) without sacrificing expressive power?

I present solutions to both. For (1), I adapt conformal prediction to meta-analysis, which may be thought of as a regression from treatment and population features (e.g. "800mg of amiodarone for atrial fibrillation patients") to treatment effect (e.g. "60% chance of reversion to normal heart rhythm"). By using conformal prediction to safely incorporate untrusted data (i.e. observational studies and other background information), this complex regression problem can be satisfactorily addressed even with a small number of randomized controlled trials. The main technical challenges are computationally simplifying full conformal prediction (which is necessary due to the small number of trials) and handling noisy observations (due to the limited number of participants in each trial).

For (2), I present a general scheme by which nonlinearity across time can be replaced by nonlinearity along depth. This involves stacking linear systems with interposed local corrections. This scheme is fast and constructive, involves no additional parameters, provably converges even in the worst case, and empirically exhibits fast convergence.
It can be used to practically develop fast sequence models and to theoretically understand the power of depth.

Both of these solutions exemplify a broader thesis of developing syntheses between classical machine learning techniques (such as meta-analytic averaging or linear dynamical systems) and modern approaches (such as deep nonlinear regression or Transformers). The high-level goal is to combine the safety and tractability of classical approaches with the accuracy of modern ones through a close (and sometimes surprising) examination of their technical relationship.

Thesis Committee:
Geoffrey Gordon (Chair)
Zachary Lipton
Aditi Raghunathan
Ryan Tibshirani (University of California, Berkeley)

Event Website: https://csd.cmu.edu/calendar/thesis-oral-defense-shiva-kaul